Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
We present Prompt Diffusion, a framework for enabling in-context learning in diffusion-based generative models. Given a pair of task-specific example images, such as depth from/to image and scribble from/to image, and a text guidance, our model automatically understands the underlying task and performs the same task on a new query image following the text guidance. To achieve this, we propose a vision-language prompt that can model a wide range of vision-language tasks and a diffusion model that takes it as input. The diffusion model is trained jointly on six different tasks using these prompts. The resulting Prompt Diffusion model becomes the first diffusion-based vision-language foundation model capable of in-context learning. It demonstrates high-quality in-context generation for the trained tasks and effectively generalizes to new, unseen vision tasks using their respective prompts. Our model also shows compelling text-guided image editing results. Our framework aims to facilitate research into in-context learning for computer vision. We share our code and pre-trained models at https://github. com/Zhendong-Wang/Prompt-Diffusion.more » « less
-
null (Ed.)Despite achieving remarkable performance, deep graph learning models, such as node classification and network embedding, suffer from harassment caused by small adversarial perturbations. However, the vulnerability analysis of graph matching under adversarial attacks has not been fully investigated yet. This paper proposes an adversarial attack model with two novel attack techniques to perturb the graph structure and degrade the quality of deep graph matching: (1) a kernel density estimation approach is utilized to estimate and maximize node densities to derive imperceptible perturbations, by pushing attacked nodes to dense regions in two graphs, such that they are indistinguishable from many neighbors; and (2) a meta learning-based projected gradient descent method is developed to well choose attack starting points and to improve the search performance for producing effective perturbations. We evaluate the effectiveness of the attack model on real datasets and validate that the attacks can be transferable to other graph learning models.more » « less
-
Intelligent personal assistant systems, with either text-based or voice-based conversational interfaces, are becoming increasingly popular. Most previous research has used either retrieval-based or generation-based methods. Retrieval-based methods have the advantage of returning fluent and informative responses with great diversity. The retrieved responses are easier to control and explain. However, the response retrieval performance is limited by the size of the response repository. On the other hand, although generation-based methods can return highly coherent responses given conversation context, they are likely to return universal or general responses with insufficient ground knowledge information. In this paper, we build a hybrid neural conversation model with the capability of both response retrieval and generation, in order to combine the merits of these two types of methods. Experimental results on Twitter and Foursquare data show that the proposed model can outperform both retrieval-based methods and generation-based methods (including a recently proposed knowledge-grounded neural conversation model) under both automatic evaluation metrics and human evaluation. Our models and research findings provide new insights on how to integrate text retrieval and text generation models for building conversation systems.more » « less
An official website of the United States government

Full Text Available